A Normal prior is reasonable for \(\beta_0\) and \(\beta_1\) because these parameters can take any value around some center (positive, negative, or even zero).
A Normal prior isn't reasonable for \(\sigma\) because \(\sigma\) can't be negative.
The difference between weakly informative priors and vague priors is that weakly informative priors are more focused: they spread uncertainty
across a range of sensible parameter values, whereas vague priors also put weight on non-sensible parameter values. With weakly informative priors, chains don't have to waste time exploring non-sensible regions of the parameter space.
The predictor is X and the response variable is Y.
Notice these have a natural order, i.e., a person's carbon footprint can't happen before they commute.
\(Y = \beta_0 + \beta_1 X\), where \(\beta_0\) is the y-intercept (the typical Y value when X = 0) and \(\beta_1\) is the slope; X is the predictor.
\(Y_{\text{kangheight}} = \beta_0 + \beta_1 X_{\text{age}}\). Here \(\beta_0\) is the height of the kangaroo at 0 months old (at birth) and \(\beta_1\) is how much the kangaroo grows with each additional month. \(\beta_1\) is positive.
\(Y_{\text{Githubfollowers}} = \beta_0 + \beta_1 X_{\text{commits}}\). Here \(\beta_0\) is the number of followers when the data scientist has 0 GitHub commits and \(\beta_1\) is how many followers are gained with each additional commit. Assuming \(\beta_1\) is positive.
\(Y_{\text{visitors}} = \beta_0 + \beta_1 X_{\text{rainfall}}\). \(\beta_0\) is the number of visitors to a local park when there are 0 inches of rainfall and \(\beta_1\) is the change in visitors for each additional inch of rainfall. \(\beta_1\) is negative.
\(Y_{\text{netflix}} = \beta_0 + \beta_1 X_{\text{sleep}}\). \(\beta_0\) is the daily hours of Netflix a person watches when they have 0 hours of sleep and \(\beta_1\) is the change in Netflix hours for each additional hour of sleep. \(\beta_1\) is negative.
\(\sigma\) tells us about the spread of the data around the regression line. More spread (a higher standard deviation) means the relationship between X and Y is weaker; less spread (lower variation) means the relationship between X and Y is stronger.
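To make this concrete, here is a small sketch with simulated (hypothetical) data: the same line \(Y = 5 + 2X\) paired with a small and a large \(\sigma\). The correlation between X and Y weakens as \(\sigma\) grows.

```r
# Hypothetical data: same regression line, two different sigmas
set.seed(42)
x <- runif(100, 0, 10)
y_tight <- 5 + 2 * x + rnorm(100, mean = 0, sd = 1)  # small sigma
y_loose <- 5 + 2 * x + rnorm(100, mean = 0, sd = 10) # large sigma

# The X-Y relationship is stronger when sigma is small
cor(x, y_tight) # close to 1
cor(x, y_loose) # noticeably weaker
```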
# Load packages
library(bayesrules)
library(tidyverse)
library(rstan)
library(rstanarm)
library(bayesplot)
library(tidybayes)
library(janitor)
library(broom.mixed)
b & c) data: \(Y_{i}\) | \(\beta_0\), \(\beta_1\), \(\sigma\) ~ N(\(\mu_i\), \(\sigma\)^2) with \(\mu_i\) = \(\beta_0\) + \(\beta_1\)\(X_i\)
\(\beta_0\) ~ N(m_0, s_0^2)
\(\beta_1\) ~ N(m_1, s_1^2)
\(\sigma\) ~ Exp(l)
The unknown parameters are \(\beta_0\), \(\beta_1\), and \(\sigma\). Age can be 0 if we include infants under the age of 1, but a more realistic range is 1-78 years, with an average of 38 and a standard deviation of 13 years. \(\beta_0\) has to be positive (you can't drink a negative amount of orange juice), and \(\sigma\) has to be positive.
Let's assume that the capacity to drink OJ increases with age because of growing stomach capacity. Alternatively, OJ consumption could follow a roughly normal pattern in which younger and older people drink less than middle-aged people.
Let's say the average amount is 20 gallons per year (around 320 cups) and that the amount of orange juice people drink per year has a standard deviation of 2 gallons.
For the age assumptions I used the average age, life expectancy, and standard deviation from web searches. For the orange juice prior I drew on my personal habit of drinking between 0.5-1.5 cups of OJ a day, which multiplied over 365 days comes to roughly 20 gallons a year.
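As a sanity check on that conversion (assuming a US gallon of 16 cups and a hypothetical midpoint of 1 cup per day):

```r
# Converting daily cups of OJ to gallons per year
cups_per_day <- 1     # midpoint of the 0.5-1.5 cups/day habit
cups_per_gallon <- 16 # US liquid gallon
gallons_per_year <- cups_per_day * 365 / cups_per_gallon
gallons_per_year # about 22.8, close to the prior mean of 20
```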
plot_normal(mean = 20, sd = 2) + # prior for beta_0: an average of 20 gallons of OJ per year
  labs(x = "beta_0c", y = "pdf")
plot_normal(mean = 38, sd = 13) + # prior for beta_1: average age of 38 with an sd of 13 years
  labs(x = "beta_1", y = "pdf")
plot_gamma(shape = 1, rate = 1/13) + # Gamma(1, l) is Exponential(l); a prior sd of 13 gives l = 1/13, about 0.0769
  labs(x = "sigma", y = "pdf")
Plugging in the tuned priors, we can specify the Bayesian regression model like this: data: \(Y_{i}\) | \(\beta_0\), \(\beta_1\), \(\sigma\) ~ N(\(\mu_i\), \(\sigma\)^2) with \(\mu_i\) = \(\beta_0\) + \(\beta_1\)\(X_i\)
\(\beta_0\) ~ N(20, 2^2)
\(\beta_1\) ~ N(38, 13^2)
\(\sigma\) ~ Exp(1/13)
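The rate 1/13 follows because an Exponential(l) distribution has mean and standard deviation both equal to 1/l; a quick simulation-based check:

```r
# Check that Exp(1/13) has mean and sd of about 13
set.seed(1)
l <- 1 / 13
draws <- rexp(100000, rate = l)
mean(draws) # approximately 13
sd(draws)   # approximately 13
```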
b & c) data: \(Y_{i}\) | \(\beta_0\), \(\beta_1\), \(\sigma\) ~ N(\(\mu_i\), \(\sigma\)^2) with \(\mu_i\) = \(\beta_0\) + \(\beta_1\)\(X_i\)
\(\beta_0\) ~ N(m_0, s_0^2)
\(\beta_1\) ~ N(m_1, s_1^2)
\(\sigma\) ~ Exp(l)
The unknown parameters are \(\beta_0\), \(\beta_1\), and \(\sigma\). The temperatures can range between 0 and about 56 degrees Celsius (the highest temperature ever recorded was 134°F, roughly 56.7°C). \(\beta_0\) has to be positive and in the same range, since both parameters are on the temperature scale. \(\sigma\) has to be positive.
I'll assume it's fall, so the average of today's and tomorrow's temperatures is 15 degrees Celsius, with a standard deviation of 3 degrees Celsius. I'll use these values to build the model.
plot_normal(mean = 15, sd = 3) + # prior for beta_0: an average of 15 degrees Celsius
  labs(x = "beta_0c", y = "pdf")
plot_normal(mean = 15, sd = 3) + # prior for beta_1: average temperature of 15 with an sd of 3 degrees Celsius
  labs(x = "beta_1", y = "pdf")
plot_gamma(shape = 1, rate = 1/3) + # Gamma(1, l) is Exponential(l); a prior sd of 3 gives l = 1/3, about 0.333
  labs(x = "sigma", y = "pdf")
My tuned model is now: data: \(Y_{i}\) | \(\beta_0\), \(\beta_1\), \(\sigma\) ~ N(\(\mu_i\), \(\sigma\)^2) with \(\mu_i\) = \(\beta_0\) + \(\beta_1\)\(X_i\)
\(\beta_0\) ~ N(15, 3^2)
\(\beta_1\) ~ N(15, 3^2)
\(\sigma\) ~ Exp(1/3)
False; MCMC is used to approximate the posterior.
True.
Specify appropriate stan_glm() syntax for simulating the Normal regression model using 4 chains each of length 10000 (don’t run the code).
bunnies_model <- stan_glm(height ~ age, data = bunnies, family = gaussian,
                          prior_intercept = normal(m0, s0), # rstanarm's normal() takes the sd, not the variance
                          prior = normal(m1, s1),
                          prior_aux = exponential(l),
                          chains = 4, iter = 5000*2, seed = 800)
songs_model <- stan_glm(clicks ~ snaps, data = songs, family = gaussian,
                        prior_intercept = normal(m0, s0),
                        prior = normal(m1, s1),
                        prior_aux = exponential(l),
                        chains = 4, iter = 5000*2, seed = 801)
happy_model <- stan_glm(happiness ~ age, data = dogs, family = gaussian,
                        prior_intercept = normal(m0, s0),
                        prior = normal(m1, s1),
                        prior_aux = exponential(l),
                        chains = 4, iter = 5000*2, seed = 800)
\(\beta_0\) ~ N(5000, 2000^2): using the range rule ((max - min)/4), we can say the sd is 2000.
\(\beta_1\) ~ N(10, 5^2): by the range rule the sd is 20/4, so about 5.
\(\sigma\) ~ Exp(0.2)
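The range rule can be sketched as a one-line helper; the min/max endpoints below are hypothetical guesses chosen only to reproduce the sds above:

```r
# Range rule: sd is roughly (max - min) / 4
range_rule <- function(min, max) (max - min) / 4

range_rule(min = 1000, max = 9000) # 2000, the sd used for beta_0
range_rule(min = 0, max = 20)      # 5, the sd used for beta_1
```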
# Load and plot data
data(bikes)
ggplot(bikes, aes(x = humidity, y = rides)) + # taking a look at rides vs humidity
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
#simulating the prior
bike_modelp <- stan_glm(rides ~ humidity, data = bikes,
                        family = gaussian,
                        prior_intercept = normal(5000, 2000),
                        prior = normal(10, 5),
                        prior_aux = exponential(0.2),
                        prior_PD = TRUE, # prior_PD = TRUE simulates the prior rather than the posterior
                        chains = 5, iter = 4000*2, seed = 855)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000115 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.15 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.349637 seconds (Warm-up)
## Chain 1: 0.072067 seconds (Sampling)
## Chain 1: 0.421704 seconds (Total)
## Chain 1:
#visualizing the chains and density plot
# Trace plots of parallel chains
mcmc_trace(bike_modelp, size = 0.1)
# Density plots of parallel chains
mcmc_dens_overlay(bike_modelp)
# Prior summary statistics
tidy(bike_modelp, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 3 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 4391. 2000. 1787. 6936.
## 2 humidity 10.0 4.97 3.63 16.5
## 3 sigma 3.47 5.05 0.524 11.6
#neff and rhat
neff_ratio(bike_modelp)
## (Intercept) humidity sigma
## 0.74750 0.75005 0.93155
rhat(bike_modelp)
## (Intercept) humidity sigma
## 0.9999681 1.0004732 1.0001440
The R-hat values are around 1, which suggests the chains are stable, and the effective sample size ratios are well above the 0.1 rule of thumb.
# 100 simulated model lines
bikes |>
  add_epred_draws(bike_modelp, ndraws = 100) |> # add_epred_draws() replaces the deprecated add_fitted_draws()
  ggplot(aes(x = humidity, y = rides)) + # plotting humidity and rides
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15) +
  geom_point(data = bikes, size = 0.05)
# Simulate four sets of data
bikes |>
  add_predicted_draws(bike_modelp, ndraws = 4) |> # ndraws replaces the deprecated n argument
  ggplot(aes(x = humidity, y = rides)) + # changed from textbook to humidity
  geom_point(aes(y = .prediction, group = .draw), size = 0.2) +
  facet_wrap(~ .draw)
# Load and plot data
data(bikes)
ggplot(bikes, aes(x = humidity, y = rides)) + # taking a look at rides vs humidity
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
#a)
#simulating the posterior
bike_model <- stan_glm(rides ~ humidity, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(5000, 2000),
                       prior = normal(10, 5),
                       prior_aux = exponential(0.2),
                       chains = 5, iter = 4000*2, seed = 850)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 5.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.57 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.151774 seconds (Warm-up)
## Chain 1: 0.204536 seconds (Sampling)
## Chain 1: 0.35631 seconds (Total)
## Chain 1:
#b)
# Trace plots of parallel chains
mcmc_trace(bike_model, size = 0.1)
# Density plots of parallel chains
mcmc_dens_overlay(bike_model)
# Posterior summary statistics
tidy(bike_model, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 3543. 198. 3290. 3798.
## 2 humidity -0.931 2.96 -4.76 2.91
## 3 sigma 1281. 30.7 1242. 1321.
## 4 mean_PPD 3484. 82.0 3379. 3588.
#neff and rhat
neff_ratio(bike_model)
## (Intercept) humidity sigma
## 0.94920 0.95990 0.96335
rhat(bike_model)
## (Intercept) humidity sigma
## 1.0000961 1.0000708 0.9999036
The MCMC trace plots look like fuzzy caterpillars and R-hat is close to 1, both signs that the chains are stable. The posterior 80% credible interval for the humidity coefficient is (-4.76, 2.91), while the prior interval was (3.63, 16.5). The posterior relationship is rides = 3543.14 - 0.93(humidity): for each one-unit increase in humidity we expect ridership to decrease by about 1 ride, though the slope could plausibly fall anywhere between -4.76 and 2.91.
#posterior simulation
bike_default_post <- update(bike_modelp, prior_PD = FALSE)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.235376 seconds (Warm-up)
## Chain 1: 0.200635 seconds (Sampling)
## Chain 1: 0.436011 seconds (Total)
## Chain 1:
# 100 posterior model lines
bikes |>
  add_epred_draws(bike_default_post, ndraws = 100) |> # add_epred_draws() replaces the deprecated add_fitted_draws()
  ggplot(aes(x = humidity, y = rides)) +
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
Compared to the prior, the posterior model has many more lines with a negative slope. Both the prior and posterior show a lot of variability among the lines.
#a)
# Posterior summary statistics
tidy(bike_model, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.95) #.95 credible interval
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 3543. 198. 3154. 3934.
## 2 humidity -0.931 2.96 -6.86 4.92
## 3 sigma 1281. 30.7 1223. 1342.
## 4 mean_PPD 3484. 82.0 3324. 3644.
The posterior median of \(\sigma\) (about 1281) means that daily ridership typically varies around the mean model line by roughly 1281 rides.
The 95% posterior credible interval for the humidity coefficient is (-6.86, 4.92): there is a 95% posterior probability that the slope lies in this interval.
The point estimate for the humidity coefficient is negative, but the 95% credible interval also includes positive values, so the evidence for a negative relationship between humidity and ridership is weak.
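One way to quantify how weak that evidence is would be to compute the posterior probability that the slope is negative. The sketch below approximates the posterior of the humidity coefficient by a Normal distribution with the summary values above (estimate -0.931, std.error 2.96); in practice you would use the actual MCMC draws, e.g. mean(bike_model_df$humidity < 0).

```r
# Normal approximation to the posterior of the humidity coefficient,
# using the tidy() summary values rather than the actual MCMC draws
set.seed(84735)
approx_draws <- rnorm(20000, mean = -0.931, sd = 2.96)
mean(approx_draws < 0) # roughly 0.6: only weak evidence of a negative slope
```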
Without using the posterior_predict() function, simulate (1) the posterior model for the typical number of riders on 90% humidity days, and (2) the posterior predictive model for the number of riders tomorrow.
bike_model_df <- as.data.frame(bike_model) # storing the 5 chains from the posterior model as a data frame
first_set <- head(bike_model_df, 1)
first_set
## (Intercept) humidity sigma
## 1 3729.559 -2.95844 1284.06
mu = \(\beta_0^{(1)}\) + \(\beta_1^{(1)}\)X = 3729.559 - 2.95844X
mu <- first_set$'(Intercept)' + first_set$humidity * .90 #finding average of mu when humidity is .9
Mu is 3726.8960
set.seed(853)
y_new <- rnorm(1, mean = mu, sd = first_set$sigma)
mu
## [1] 3726.896
y_new is 3264.887. Notice this is below the average number of rides on days with humidity of .90.
# Predict rides for each parameter set in the chain
set.seed(853)
predict_90 <- bike_model_df |>
  mutate(mu = `(Intercept)` + (humidity * .90),
         y_new = rnorm(20000, mean = mu, sd = sigma)) # 20000 = 5 chains x 4000 post-warmup draws
#see predictions
head(predict_90, 5)
## (Intercept) humidity sigma mu y_new
## 1 3729.559 -2.9584399 1284.060 3726.896 3264.887
## 2 3390.540 0.6309655 1289.694 3391.108 3338.685
## 3 3634.177 -2.3585453 1300.529 3632.054 1314.300
## 4 3502.248 -0.6783358 1277.923 3501.637 2067.268
## 5 3808.408 -5.0733243 1262.807 3803.842 2537.519
# Construct 95% posterior credible intervals
predict_90 |>
summarize(lower_mu = quantile(mu, 0.025),
upper_mu = quantile(mu, 0.975),
lower_new = quantile(y_new, 0.025),
upper_new = quantile(y_new, 0.975))
## lower_mu upper_mu lower_new upper_new
## 1 3158.124 3928.692 1033.441 6077.826
I used View() to see how the variable for tomorrow (a Sunday) is coded in day_of_week; it's "Sun". I then built a posterior model for Sundays.
#see how often Sunday appears in the data
bikes |>
  filter(day_of_week == "Sun") |>
  summarize(Sun = n())
## Sun
## 1 67
#create a dataset with just the Sunday data
sunday_rides <- bikes |>
  filter(day_of_week == "Sun")
#build posterior model for Sunday. First I'll take a look at rides on Sundays.
#(sunday_rides is a local data frame, so no data() call is needed)
ggplot(sunday_rides, aes(x = day_of_week, y = rides)) + # taking a look at rides on Sundays
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
#will also take a look at the data without the Sunday filter
# Load and plot data
data(bikes)
ggplot(bikes, aes(x = day_of_week, y = rides)) + # taking a look at rides by day_of_week
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
#a)
#simulating the posterior
sun_model <- stan_glm(rides ~ humidity, data = sunday_rides, # using the Sunday-only data
                      family = gaussian,
                      prior_intercept = normal(5000, 2000), # using the same priors as before
                      prior = normal(10, 5),
                      prior_aux = exponential(0.2),
                      chains = 5, iter = 4000*2, seed = 830)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.129492 seconds (Warm-up)
## Chain 1: 0.098592 seconds (Sampling)
## Chain 1: 0.228084 seconds (Total)
## Chain 1:
#posterior simulation (sun_model above was already fit to the data, so this update is redundant but harmless)
sunday_post <- update(sun_model, prior_PD = FALSE)
# 5 posterior model lines
sunday_rides |>
add_epred_draws(sunday_post, ndraws = 5) |> #epred_draws replaces the deprecated fitted_draws
ggplot(aes(x = humidity, y = rides)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
#store the posterior chains as a data frame
sunday_df <- as.data.frame(sun_model)
set.seed(30)
#grab the first posterior parameter set
second_set <- head(sunday_df, 1)
second_set
## (Intercept) humidity sigma
## 1 2272.662 10.96081 659.3914
Based on this first posterior parameter set, mu_sun = 2272.662 + 10.96X.
mu_sun <- second_set$'(Intercept)'+second_set$humidity*.90 #posterior mean ridership when humidity is 0.90 on a Sunday
mu_sun
## [1] 2282.527
y_newsun <- rnorm(1, mean = mu_sun, sd = second_set$sigma) #simulate one new ride count from this parameter set
y_newsun
# Predict rides for each parameter set in the chain
set.seed(25)
predict_90v2 <- sunday_df |>
mutate(mu = `(Intercept)` + (humidity*.90),
y_newsun = rnorm(20000, mean = mu, sd = sigma))
#see predictions
head(predict_90v2, 5)
## (Intercept) humidity sigma mu y_newsun
## 1 2272.662 10.960806 659.3914 2282.527 2142.846
## 2 1892.379 16.683833 642.1061 1907.394 1238.582
## 3 2640.119 1.599782 694.9234 2641.559 1840.098
## 4 2072.064 12.684397 683.5994 2083.480 2303.279
## 5 2105.500 11.757314 661.2924 2116.081 1124.057
# Trace plots of parallel chains for the original bike model
mcmc_trace(bike_model, size = 0.1)
# Density plots of parallel chains for the original bike model
mcmc_dens_overlay(bike_model)
# Trace plots of parallel chains for Sunday's posterior
mcmc_trace(sun_model, size = 0.1)
# Density plots of parallel chains
mcmc_dens_overlay(sun_model)
The trace plots for both models look stable, and the density plots from parallel chains overlap, so both posteriors appear to have converged.
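As a numeric check to back up the visual diagnostics, we can also look at R-hat and effective sample size ratios (a quick sketch using bayesplot functions, which rstanarm loads):

```r
# Numeric convergence checks for the Sunday model
rhat(sun_model)        # values near 1 suggest the chains mixed well
neff_ratio(sun_model)  # effective sample size ratio per parameter
```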
# Construct 80% posterior credible intervals for Sunday data
predict_90v2 |>
summarize(lower_mu = quantile(mu, 0.10),
upper_mu = quantile(mu, 0.90),
lower_new = quantile(y_newsun, 0.10),
upper_new = quantile(y_newsun, 0.90))
#using tidy to find credible interval for tomorrow
tidy(sun_model, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 2379. 240. 2069. 2690.
## 2 humidity 7.95 3.67 3.22 12.7
## 3 sigma 672. 29.3 636. 711.
## 4 mean_PPD 2874. 117. 2724. 3024.
The 80% credible interval for the humidity coefficient (the slope) means there is an 80% posterior probability that the slope is between 3.22 and 12.7. The intervals for the intercept and sigma are read the same way.
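These bounds can be cross-checked straight from the fitted model with rstanarm's posterior_interval(), which should match tidy()'s 80% limits for the coefficients:

```r
# Cross-check: 80% credible intervals from the fitted Sunday model
posterior_interval(sun_model, prob = 0.80)
```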
set.seed(222)
fast_prediction <-
posterior_predict(sun_model, newdata = data.frame(humidity = 0.90))
# Construct an 80% posterior prediction interval
posterior_interval(fast_prediction, prob = 0.80)
## 10% 90%
## 1 1466.76 3293.547
# Plot the approximate predictive model
mcmc_dens(fast_prediction) +
xlab("predicted ridership on a Sunday")
posterior_predict() gives an 80% posterior prediction interval of roughly 1467 to 3294 riders on a Sunday with 90% humidity.
data(bikes)
ggplot(bikes, aes(x = windspeed, y = rides)) + #taking a look at rides and windspeed to think of good priors
geom_point(size = 0.5) +
geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
B0 ~ N(6500, 1625): by the range rule the sd is (max - min)/4, so approximately (6500 - 0)/4 = 1625.
B1 ~ N(-40, 10): rides should fall as windspeed rises; by the range rule the sd is 40/4 = 10.
\(\sigma\) ~ Exp(0.10): taking the rate to be 1 over the sd guess of 10.
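The range-rule arithmetic above can be sketched directly (the plausible ranges here are assumptions, not values estimated from the data):

```r
# Range-rule sketch for the windspeed priors (assumed plausible ranges)
b0_sd      <- (6500 - 0) / 4  # intercept sd: (max - min) / 4 = 1625
b1_sd      <- 40 / 4          # slope sd from a plausible magnitude of 40
sigma_rate <- 1 / 10          # exponential rate = 1 / sd guess = 0.1
c(b0_sd, b1_sd, sigma_rate)
```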
prior_summary(bike_model) #recall the original bike model's priors for comparison
## Priors for model 'bike_model'
## ------
## Intercept (after predictors centered)
## ~ normal(location = 5000, scale = 2000)
##
## Coefficients
## ~ normal(location = 10, scale = 5)
##
## Auxiliary (sigma)
## ~ exponential(rate = 0.2)
## ------
## See help('prior_summary.stanreg') for more details
wind_model_prior <- stan_glm(
rides ~ windspeed, data = bikes,
family = gaussian,
prior_intercept = normal(6500, 2.5, autoscale = TRUE),
prior = normal(0, 2.5, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
prior_PD = TRUE,
chains = 4, iter = 5000*2, seed = 840)
# 100 prior model lines with wind speed
bikes |>
add_epred_draws(wind_model_prior, ndraws = 100) |> #epred_draws replaces the deprecated fitted_draws
ggplot(aes(x = windspeed, y = rides)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
# 4 prior simulated datasets
set.seed(30)
bikes |>
add_predicted_draws(wind_model_prior, ndraws = 4) |> #ndraws replaces the deprecated n argument
ggplot(aes(x = windspeed, y = rides)) +
geom_point(aes(y = .prediction, group = .draw)) +
facet_wrap(~ .draw)
Many of the 100 prior lines reflect our prior understanding that rides decrease as windspeed increases, though there is a lot of variability. Two of the four simulated datasets look similar and show a negative relationship.
wind_model <- stan_glm(
rides ~ windspeed, data = bikes,
family = gaussian,
prior_intercept = normal(6500, 2.5, autoscale = TRUE),
prior = normal(0, 2.5, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
chains = 4, iter = 5000*2, seed = 840)
set.seed(222)
fast_prediction_wind <-
posterior_predict(wind_model, newdata = data.frame(windspeed = 20))
# Construct an 80% posterior prediction interval
posterior_interval(fast_prediction_wind, prob = 0.80)
## 10% 90%
## 1 1079.755 5079.995
# Plot the approximate predictive model
mcmc_dens(fast_prediction_wind) +
xlab("predicted ridership by windspeed")
tidy(wind_model, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 4205. 178. 3980. 4435.
## 2 windspeed -55.6 12.6 -71.9 -39.4
## 3 sigma 1547. 49.1 1486. 1613.
## 4 mean_PPD 3482. 97.7 3358. 3608.
The posterior matches my prior understanding: the relationship between windspeed and rides is negative.
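One way to see how the data updated that belief is to overlay draws of the windspeed slope from the prior and posterior models (a sketch using the two models fit above):

```r
# Compare prior vs posterior beliefs about the windspeed slope
prior_draws <- as.data.frame(wind_model_prior)
post_draws  <- as.data.frame(wind_model)
ggplot() +
  geom_density(data = prior_draws, aes(x = windspeed), color = "gray60") +
  geom_density(data = post_draws,  aes(x = windspeed), color = "steelblue")
```

The posterior density should be far narrower than the prior and concentrated on negative values.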
pen_model_prior <- stan_glm(
flipper_length_mm ~ bill_length_mm, data = penguins_bayes,
family = gaussian,
prior_intercept = normal(45, 2.5, autoscale = TRUE),
prior = normal(0, 2.5, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
prior_PD = TRUE, #making sure it's the prior
chains = 4, iter = 5000*2, seed = 83)
# 100 prior model lines for the penguin model
penguins_bayes |> drop_na() |> #added drop_na() so the code could run
add_epred_draws(pen_model_prior, ndraws = 100) |> #epred_draws replaces the deprecated fitted_draws
ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
prior_summary(pen_model_prior)
## Priors for model 'pen_model_prior'
## ------
## Intercept (after predictors centered)
## Specified prior:
## ~ normal(location = 45, scale = 2.5)
## Adjusted prior:
## ~ normal(location = 45, scale = 35)
##
## Coefficients
## Specified prior:
## ~ normal(location = 0, scale = 2.5)
## Adjusted prior:
## ~ normal(location = 0, scale = 6.4)
##
## Auxiliary (sigma)
## Specified prior:
## ~ exponential(rate = 1)
## Adjusted prior:
## ~ exponential(rate = 0.071)
## ------
## See help('prior_summary.stanreg') for more details
Using the prior_summary() output above, the written prior model is \(Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2)\) with \(\mu_i = \beta_0 + \beta_1 X_i\), where
\(\beta_0\) ~ N(45, 35)
\(\beta_1\) ~ N(0, 6.4) (the autoscaled coefficient prior)
\(\sigma\) ~ Exp(0.071)
# 4 prior simulated datasets
set.seed(40)
penguins_bayes |> drop_na() |> #added drop_na()
add_predicted_draws(pen_model_prior, ndraws = 4) |> #ndraws replaces the deprecated n argument
ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
geom_point(aes(y = .prediction, group = .draw)) +
facet_wrap(~ .draw)
data("penguins_bayes")
ggplot(penguins_bayes, aes(x = bill_length_mm, y = flipper_length_mm)) +
  geom_point(size = 0.1) +
  geom_smooth(method = "lm", se = FALSE) # geom_smooth, not geom_line, takes method/se
pen_model <- stan_glm(
flipper_length_mm ~ bill_length_mm, data = penguins_bayes,
family = gaussian,
prior_intercept = normal(51, 2.5, autoscale = TRUE),
prior = normal(0, 2.5, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
prior_PD = FALSE,
chains = 4, iter = 5000*2, seed = 83)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.09767 seconds (Warm-up)
## Chain 1: 0.194162 seconds (Sampling)
## Chain 1: 0.291832 seconds (Total)
## Chain 1:
# 100 posterior model lines
penguins_bayes |> drop_na() |> # added drop_na() so the code runs
  add_epred_draws(pen_model, ndraws = 100) |>
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
view(penguins_bayes) # looking for the Pablo species in the dataset
# creating the Pablo dataset
Pablo <- penguins_bayes |>
  filter(species == "Gentoo") # used Gentoo since Pablo isn't in the dataset
pablo_model <- stan_glm(
flipper_length_mm ~ bill_length_mm, data = Pablo,
family = gaussian,
prior_intercept = normal(51, 2.5, autoscale = TRUE),
prior = normal(0, 2.5, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
prior_PD = FALSE,
chains = 4, iter = 5000*2, seed = 83)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.093814 seconds (Warm-up)
## Chain 1: 0.128274 seconds (Sampling)
## Chain 1: 0.222088 seconds (Total)
## Chain 1:
# 100 posterior model lines for Pablo
Pablo |> drop_na() |>
  add_epred_draws(pablo_model, ndraws = 100) |> # pablo_model, not pen_model
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
  geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
b) Trace and density plots for all penguins and for Pablo.
# Trace plots of parallel chains for pen_model
mcmc_trace(pen_model, size = 0.1)
# Density plots of parallel chains
mcmc_dens_overlay(pen_model)
# Trace plots of parallel chains
mcmc_trace(pablo_model, size = 0.1)
# Density plots of parallel chains
mcmc_dens_overlay(pablo_model)
The chains look stable, and the overlaid density plots are narrow, roughly normal, and agree across chains, which increases confidence in these posteriors.
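Numerical diagnostics can back up the visual checks (this assumes rhat() and neff_ratio() from bayesplot, loaded with the book's setup):

```r
rhat(pen_model)       # should be close to 1 for every parameter
neff_ratio(pen_model) # effective sample size ratio; above 0.1 is a common benchmark
```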
tidy(pen_model, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 127. 4.65 121. 133.
## 2 bill_length_mm 1.69 0.105 1.55 1.82
## 3 sigma 10.6 0.409 10.1 11.2
## 4 mean_PPD 201. 0.814 200. 202.
tidy(pablo_model, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 151. 6.83 142. 160.
## 2 bill_length_mm 1.39 0.143 1.21 1.57
## 3 sigma 4.90 0.322 4.52 5.34
## 4 mean_PPD 217. 0.627 216. 218.
The 80% credible interval for the intercept is (120.74, 132.67) across all species and (142.37, 159.76) for the Gentoo penguins standing in for Pablo, so Pablo's penguins tend to have longer flippers. For a penguin with a 51 mm bill, the Pablo model should give a narrower interval, since its posterior \(\sigma\) is smaller (about 4.90 vs 10.6).
#posterior predictions of penguins and pablo
set.seed(51)
shortpen_prediction <-
posterior_predict(pen_model, newdata = data.frame(bill_length_mm = 51))
posterior_interval(shortpen_prediction, prob = 0.80)
## 10% 90%
## 1 198.9871 226.3826
#pablo
set.seed(51)
shortpab_prediction <-
posterior_predict(pablo_model, newdata = data.frame(bill_length_mm = 51))
posterior_interval(shortpab_prediction, prob = 0.80)
## 10% 90%
## 1 215.519 228.242
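The widths of the two 80% prediction intervals confirm this (a quick check using the numbers copied from the output above):

```r
# widths of the 80% posterior prediction intervals at bill_length_mm = 51
width_all   <- 226.3826 - 198.9871 # all penguins: about 27.4 mm
width_pablo <- 228.242  - 215.519  # Pablo (Gentoo): about 12.7 mm
width_pablo < width_all            # TRUE: the Pablo interval is narrower
```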
Their prior understanding is that as body mass increases, flipper length also increases, by a small amount per gram and with a small standard deviation.
data("penguins_bayes")
ggplot(penguins_bayes, aes(x = body_mass_g, y = flipper_length_mm)) +
  geom_point(size = 0.1) +
  geom_smooth(method = "lm", se = FALSE) # geom_smooth, not geom_line, takes method/se
As body mass increases, flipper length also increases.
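A quick numerical check of this positive association (straightforward dplyr; `use = "complete.obs"` drops the rows with missing measurements):

```r
penguins_bayes |>
  summarize(r = cor(body_mass_g, flipper_length_mm, use = "complete.obs"))
# a correlation well above 0 supports the positive trend in the plot
```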
data("penguins_bayes")
ggplot(penguins_bayes, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(size = 0.1) +
  geom_smooth(method = "lm", se = FALSE) # geom_smooth, not geom_line, takes method/se
penbody_model <- stan_glm(
  flipper_length_mm ~ body_mass_g, data = penguins_bayes,
  family = gaussian,
  prior_intercept = normal(3000, 500), # guessed prior
  prior = normal(0.01, 0.002), # scientist's prior: small positive slope, small sd
  prior_aux = exponential(0.002),
  chains = 4, iter = 5000*2, seed = 322)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 2.06347 seconds (Warm-up)
## Chain 1: 1.34832 seconds (Sampling)
## Chain 1: 3.41179 seconds (Total)
## Chain 1:
# Trace plots of parallel chains
mcmc_trace(penbody_model, size = 0.1)
# Density plots of parallel chains
mcmc_dens_overlay(penbody_model)